This report was prepared for ETC5521 Assignment 1 by Team WOMBAT, comprising Hai Hanh Ngo, Dewi Lestari Amaliah, Siyi Li, and Priya Ravindra Dingorkar.

1 Acknowledgements

We sincerely thank Professor Dianne Cook, Dr Emi Tanaka, and the tutors Sayani Gupta and Sherry Zhang for their guidance and encouragement. Their combined efforts put us in a position to produce this report collaboratively.

2 Introduction and Motivation

Getting data from the hotel industry has always been a real challenge. Although we can easily read about hotel jargon, or search market research reports on the internet for standard descriptive statistics of the hotel industry and the price of our next summer holiday, less is known about how a hotel actually works behind the scenes. In this report we take a detailed look at hotel data and present some interesting findings.

The “hotel booking demand datasets” compiled by Nuno Antonio, Ana de Almeida and Luis Nunes (Antonio, Almeida, and Nunes 2019) were a valuable effort to overcome this challenge. The data were obtained from two hotels in Portugal: one city hotel in Lisbon and one resort hotel in the Algarve. Some sensitive information that could reveal the identity of the two hotels was not provided, but this does not diminish the value of the dataset for education, management, machine learning and many other purposes. We analyze the data to uncover interesting findings about how the two hotels operate.

3 Research Questions

3.1 Primary Research Questions

We compare the efficiency of the two hotels, the City Hotel and the Resort Hotel, to understand their business and what it really looks like behind the scenes, and we draw out interesting facts and findings through our analysis.

3.2 Secondary Research Questions

  • Which months across the two years saw the highest inflow of tourists?

  • What is the average daily rate of each type of hotel, and which type of hotel makes more money?

  • Exploring the hotels’ market segments and customers’ booking preferences

  • Which segment of the hotel market is more profitable and lets customers book their trip easily?

  • Where did the bookings come from?

  • How do international guests like to stay in the hotels of Portugal, and how long do they stay?

  • Is there any relationship between the customer type and the ADR (Average Daily Rate) in the two different hotel types?

  • Can the meal type recorded in the hotel data tell us anything about the guest’s length of stay?

  • Cars play a huge role in modern life. Is there any relationship between the car parking spaces required and the two different types of hotels in the dataset?

4 Data description

4.1 Dataset overview and structure

The datasets for the two hotels can be downloaded separately from ScienceDirect.com, in the paper by Antonio, Almeida, and Nunes (2019). However, Mock (2020), for the tidytuesday challenge, did us a favor and combined the two datasets into one. The two original datasets and the combined one can be obtained at this GitHub page.

For this study, we used only the combined dataset, which is stored as a CSV file with 32 variables and 119,390 observations. Each observation represents one hotel booking.

Knowing that this dataset comes from hotels, we can make sense of most of the variables. However, not all of them are familiar to someone without a background in hotel management. We run through some of the industry jargon and the variables’ meanings before taking a closer look at the data.

Variables note:

  • is_canceled: (1) if the booking was cancelled and (0) if not.
  • lead_time: number of days between when the booking was entered into the hotel’s booking system and the arrival date.
  • meal: type of meal booked, which can be:
    • BB: Bed and Breakfast.
    • FB: Full board (breakfast, lunch and dinner).
    • HB: Half board (breakfast and one other meal, usually dinner).
    • Undefined/SC: no meal package.
  • country: guests’ country of origin.
  • market_segment: guests’ market segment, some of which may be associated with a booking channel.
    • Direct: guests who book directly with the hotel, for example through the hotel’s website, by phone, or as walk-ins.
    • Corporate: guests whose bookings are made by a corporation/company, or guests who are business travellers.
    • Online TA: online travel agents, i.e. bookings made through third-party websites such as Agoda, Expedia, Booking.com…
    • Offline TA/TO: bookings made by travel agents or tour operators.
    • Complementary: free stays offered to guests, usually through the hotels’ promotional programs.
    • Groups: guests who travelled in groups.
    • Undefined: undefined type of guests.
    • Aviation: we are not entirely sure, but this could be airline crews.
  • distribution_channel:
    • Direct: bookings made directly with the hotels (hotel’s website, phone or walk-ins).
    • Corporate: bookings made by a corporation/company.
    • TA/TO: travel agents/tour operators.
    • Undefined: undefined distribution channel.
    • GDS: Global Distribution System. A GDS serves as a hub through which companies in the travel industry (airlines, hotels, car rental…) connect with travel agents. Hotels place some of their inventory (rooms) on the GDS, and travel agents can then sell those rooms to their customers. Well-known GDSs include Amadeus, Sabre and Galileo.
  • is_repeated_guest: (1) if the guest is a repeated guest and (0) if not.
  • customer_type:
    • Transient: individuals or groups that occupy fewer than 10 rooms per night. These guests usually stay in the hotel short-term and require few services.
    • Contract: bookings bound by contracts, usually for more than 30 days for a consistent block of rooms.
    • Transient-Party: transient booking that is associated with another transient booking.
    • Group: bookings associated with a group, usually occupying more than 10 rooms per night.
  • previous_cancellations: number of bookings cancelled by the customer prior to the current booking.
  • previous_bookings_not_canceled: number of non-cancelled bookings made by the customer prior to the current booking.
  • booking_changes: number of changes made to the booking from when it was entered into the system until the day of arrival/cancellation.
  • agent: ID of the travel agency that made the booking.
  • company: ID of the company that made the booking.
  • days_in_waiting_list: number of days the booking was on the waiting list before it was confirmed to the customer.
  • adr: Average Daily Rate, computed by dividing total room revenue (excluding breakfast, tax and service charges) by the total number of room nights sold.

Note that some variables (e.g. agent or company) contain “NULL” values. A “NULL” value does not mean that the value is missing; rather, the value did not exist to begin with. For example, a booking may have no agent or company ID associated with it because it was made by an individual customer.
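One way to keep this “not applicable” meaning explicit when reading the file is sketched below in Python, using only the standard library. The helper name parse_optional_id and the sample rows are hypothetical and only illustrate the idea; they are not the code used for this report, which was written in R.

```python
from typing import Optional


def parse_optional_id(raw: str) -> Optional[int]:
    """Return the ID as an int, or None when the field holds the literal text "NULL"."""
    value = raw.strip()
    return None if value == "NULL" else int(value)


# Hypothetical rows, shaped like the agent and company columns.
rows = [
    {"agent": "9", "company": "NULL"},
    {"agent": "NULL", "company": "NULL"},
]

for row in rows:
    agent = parse_optional_id(row["agent"])
    company = parse_optional_id(row["company"])
    # None means "no agent/company was involved", not "value unknown".
    booked_by_individual = agent is None and company is None
    print(agent, company, booked_by_individual)
```

Mapping “NULL” to None, rather than to a numeric placeholder, keeps these bookings from being counted as missing data or as a real agent ID.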

4.2 Limitations

  • This dataset contains data for two specific hotels in Portugal. This limited scope made it challenging to explain the trends we observed, since the reasons could be hotel-specific and we could not rely on general industry knowledge to account for them.
  • Some information about the hotels was not provided, for example the number of rooms, the occupancy rate, the location of the hotels (in a busy district or in the suburbs), years of operation, or special events that might have occurred. This lack of information may leave some of our questions unanswered.
  • Even though we were provided with the collection method, we were not able to verify the validity and correctness of the data. During our analysis we noticed that some entries were not sensible and were very likely due to input errors. However, we were unable to verify this.
  • Since observations began in July 2015 and ended in August 2017, only 2016 is covered for a full year; 2015 and 2017 are covered for only six and eight months, respectively. Hence, we do not analyze the data year by year, because the years are not directly comparable.

4.3 Collection methods

Antonio, Almeida, and Nunes (2019) collected the data by extracting the variables from the hotels’ PMS (Property Management System) database servers with a T-SQL query in SQL Server Studio Manager. The tables used to extract the variables are:

  1. BO (bookings table, from which the key, i.e. the ID, was retrieved).
  2. BL (bookings change log; if the booking details as of the day before arrival had changed, the value present in this table was used).
  3. ML (meals).
  4. DC (distribution channel).
  5. TR (transaction).
  6. CP (customer profiles).
  7. NT (nationalities).
  8. MS (market segments).

The diagram below, made by Antonio, Almeida, and Nunes (2019), presents the structure of the PMS databases:

Figure 4.1: PMS database diagrams

4.4 Data Cleaning

4.4.1 Missing values checking

Bennett (2001) argued that it is important to take missing values into account; otherwise the statistical analysis will be misleading and the variability of the data cannot be estimated correctly. Thus, before analyzing the data, we first checked for missing values using the visdat package (Tierney 2017).

Figure 4.2: Variables Type and Missing Value Visualization

We found missing values in the children variable and imputed them with the mean of children (mean imputation) (Kang 2013). The result was stored in a new variable called imputed_children, which we added to the original dataset.
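As a minimal illustration of mean imputation, the Python sketch below replaces missing entries with the mean of the observed ones. The function name impute_with_mean and the sample values are hypothetical; the report itself performed this step in R.

```python
from statistics import mean


def impute_with_mean(values):
    """Replace None entries with the mean of the observed (non-None) entries."""
    observed = [v for v in values if v is not None]
    # mean() raises StatisticsError if every entry is missing.
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical children counts with one missing entry.
children = [0.0, 2.0, None, 1.0]
imputed_children = impute_with_mean(children)
print(imputed_children)
```

Note that a mean is generally not a whole number, so an imputed count of children can be fractional.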

Figure 4.2 shows that there are otherwise no missing values in the dataset. This is in line with the statement by Antonio, Almeida, and Nunes (2019) that there are no missing values in the database tables. However, we must note that some “NULL” values are present, which should be interpreted as “not applicable” rather than as missing values (Antonio, Almeida, and Nunes 2019). For example, if the company value is NULL, it means that the booking was not made by a company.

4.4.2 Data Transformation

We transformed the data to make it more structured, since properly structured and validated data improve data quality. Data types determine how a variable can be visualised, so correct data types lead to more accurate results. With that in mind, we converted the data types using the mutate function from tidyverse (Wickham et al. 2019) and the as.factor function from the R base package (R Core Team 2020). Figure 4.3 shows that the variables now have the correct types.

Figure 4.3: Data Type Visualization

We also created some new variables by transforming or wrangling the original variables. They are listed as follows:

  1. is_canceled_new is the same as is_canceled. We only recoded the values 0 and 1 as “not canceled” and “canceled” to make them easier to interpret.
  2. lengt_of_stay is the number of days that the guests spent in the hotel. It is the sum of the stays_in_weekend_nights and stays_in_week_nights variables.
  3. stay_on is a categorical variable indicating whether the guests stayed on weekends, weekdays, or both.
  4. number_of_guest is the sum of the adults, imputed_children, and babies variables.
  5. kids is the sum of the imputed_children and babies variables.
  6. family_type is a categorical variable indicating whether the guests stayed with kids or not.
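The derivations listed above can be sketched as follows in Python. This is an illustration only: the function name derive_variables is hypothetical, and the category labels used for stay_on and family_type are assumptions, since the report does not spell them out here.

```python
def derive_variables(booking: dict) -> dict:
    """Compute the six derived variables for one booking (one row of the dataset)."""
    weekend = booking["stays_in_weekend_nights"]
    week = booking["stays_in_week_nights"]
    kids = booking["imputed_children"] + booking["babies"]

    # Assumed labels: the report only says "weekend, weekday, or both".
    if weekend > 0 and week > 0:
        stay_on = "both"
    elif weekend > 0:
        stay_on = "weekend"
    elif week > 0:
        stay_on = "weekday"
    else:
        stay_on = "none"  # zero-night bookings, if any exist

    return {
        "is_canceled_new": "canceled" if booking["is_canceled"] == 1 else "not canceled",
        "lengt_of_stay": weekend + week,
        "stay_on": stay_on,
        "number_of_guest": booking["adults"] + kids,
        "kids": kids,
        "family_type": "family" if kids > 0 else "not family",  # assumed labels
    }


# Hypothetical booking: two adults and one child, two weekend and three week nights.
example = {
    "is_canceled": 0,
    "stays_in_weekend_nights": 2,
    "stays_in_week_nights": 3,
    "adults": 2,
    "imputed_children": 1,
    "babies": 0,
}
print(derive_variables(example))
```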

The lengt_of_stay and stay_on variables are used to analyze the guests’ staying patterns, while number_of_guest, kids, and family_type are used to analyze the type of guest who stayed at the hotel. Moreover, the dataset only provides the country of each booking as an ISO country code. Hence, we converted these codes to country names using the countrycode package (Arel-Bundock, Enevoldsen, and Yetman 2018).

5 Analysis and Findings

Tourism and Seasonality in Portugal

The tourism industry pays close attention to seasonality, as it affects the flow of visitors to tourist destinations. The hotel year is divided into two main seasons: high and low. As the names suggest, the high season is a busy period when the weather is good and guest inflows are high; the low season is the opposite.

Portugal is no exception to this. The high season in Portugal usually runs in summer (June to September) and in spring (January to March); the beaches are usually busiest in July and August. The low season generally occurs during winter, beginning around November and ending at the end of February. Weather during this time can bring rainfall, unexpected showers and a strong, cold breeze, which is not ideal for sightseeing (lisbonlisboaportugal.com, n.d.).

Keeping the seasons in mind, we are interested in any potential effect of seasonality on the guest count and the ADR of our hotels.

5.1 Which months across the two years saw the highest inflow of tourists?

The inflow of visitors to these two hotels helps us decide which months are the best time to travel to Portugal, and which hotel is the place to stay.

Figure 5.1: Count of Number of Guests by each month in the two Different Years

Months are shown in order of occurrence (July is the first month of each year) to maintain the chronological order of the dataset. The overall pattern is nearly the same for both hotels and both years. The seasonality pattern resembles a “W” shape, with the low points of the W occurring in the winter months from November to January, and the high points in the spring and summer periods. Interestingly, both hotels had their highest number of guests in the spring season in Year 1 (May and March), while in Year 2 the highest numbers were recorded in the summer season (August and October).

We can therefore infer that most months are a good time to visit Portugal, particularly from July to October and from January to March. The graph above also shows that most guests prefer to stay in the City Hotel rather than the Resort Hotel.

5.2 What is the average daily rate of each type of hotel, and which type of hotel makes more money?

The average daily rate (ADR) measures the average rental income earned per occupied room per day. It can be used to gauge the operating performance of a hotel or other lodging company.
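As a small worked illustration of this definition, the Python sketch below divides room revenue by the number of room nights sold. The function name average_daily_rate and the figures are hypothetical and are not taken from the dataset.

```python
def average_daily_rate(room_revenue: float, room_nights_sold: int) -> float:
    """Room revenue (excluding breakfast, tax and service charges) per room night sold."""
    if room_nights_sold <= 0:
        raise ValueError("room_nights_sold must be positive")
    return room_revenue / room_nights_sold


# Hypothetical stay: 3 room nights generating 285.00 euros of room revenue.
print(average_daily_rate(285.0, 3))  # 95.0
```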

Figure 5.2: Average Daily Rate Versus Hotel Type

Figure 5.2 indicates a substantial gap between the two hotels. The resort had its highest prices in summer, which makes sense because the Algarve is a beach destination. Both hotels had their lowest prices in winter, but the City Hotel showed less fluctuation than the Resort Hotel.

5.3 Exploring the hotels’ market segments and customers’ booking preferences in the two hotel types

According to Meier (2017), one key to effective hotel management is knowing the specific market segments, particularly in order to set prices correctly. It is therefore important to identify the market segment distribution for these hotels.

A study from Phocuswright cited in pegs.com (2016) stated that one of the major reasons people book through OTAs is that the websites are easy to use. Meanwhile, Howe (2017) argued that OTAs are favoured because of their easy booking options. According to Jedina and Ranjinib (2017), cited in Talwar et al. (2020), the factors for booking a hotel through an OTA are accessibility, pricing, review accountability, and customer service. We examine whether this holds true for the Portuguese hotels.

Figure 5.3: The distribution of hotel market segment in the city and resort hotel (2015-2017)

Figure 5.3 shows that the vast majority of bookings came through Online Travel Agencies (OTA). For the City Hotel, customers generally booked through online or offline agencies and as groups. For the Resort Hotel, on the other hand, a larger share of customers booked through corporate or direct channels. Moreover, the share of bookings via OTA was greater in the city hotel than in the resort hotel. Unlike the city hotel, the resort hotel had a larger share of individual reservations than party bookings. This is consistent with the Phocuswright study: people value ease in booking their trips, and hence use agency bookings.

5.4 Which segment of the hotel market is more profitable and lets customers book their trip easily?

Studying which booking channel is best would help customers choose how to book their stay at the hotels.

Figure 5.4: The distribution of hotel market segment in the city and resort hotel by semester

OTA has been the major player in the market segments of these hotels; in the city hotel it took only half a year to overtake group reservations. In the first semester of 2016, the proportion of OTA bookings was double that of the previous semester.

At the resort hotel, bookings through OTA have dominated since the beginning of the observed period. In contrast to the city hotel, however, OTA bookings there decreased marginally in the first semester of 2016, while group bookings increased or customers started booking on their own. Both hotels reached their peak proportion of OTA bookings in semester 2 of 2016.

Aside from OTA, Figure 5.4 shows another interesting fact: while OTA bookings dominated the market segments, the proportion of direct bookings in the resort hotel remained relatively stable. One possible reason is that the resort has its own way of promoting direct bookings, for example through loyalty vouchers.
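Proportions like those discussed in this section can be computed by grouping bookings by hotel and semester and then dividing each segment’s count by the group total. The Python sketch below shows one way to do this; the function name, the numeric year and month fields, and the sample bookings are hypothetical, and semester 1 is assumed to mean January to June.

```python
from collections import Counter, defaultdict


def segment_share_by_semester(bookings):
    """Share of each market segment per (hotel, year, semester)."""
    counts = defaultdict(Counter)
    for b in bookings:
        semester = 1 if b["month"] <= 6 else 2  # assumed: Jan-Jun is semester 1
        counts[(b["hotel"], b["year"], semester)][b["market_segment"]] += 1

    shares = {}
    for period, segment_counts in counts.items():
        total = sum(segment_counts.values())
        shares[period] = {seg: n / total for seg, n in segment_counts.items()}
    return shares


# Hypothetical bookings, not taken from the dataset.
sample = [
    {"hotel": "City Hotel", "year": 2016, "month": 2, "market_segment": "Online TA"},
    {"hotel": "City Hotel", "year": 2016, "month": 3, "market_segment": "Online TA"},
    {"hotel": "City Hotel", "year": 2016, "month": 5, "market_segment": "Online TA"},
    {"hotel": "City Hotel", "year": 2016, "month": 6, "market_segment": "Groups"},
    {"hotel": "Resort Hotel", "year": 2016, "month": 8, "market_segment": "Direct"},
]
shares = segment_share_by_semester(sample)
print(shares[("City Hotel", 2016, 1)])
```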

5.5 Where did the bookings come from?

Another way to gain business insight is to look at the origins of the travellers who book the hotels. Understanding their behaviour and preferences is essential, so that hoteliers can establish strategies for attracting them.

We can examine which parts of the world are most drawn to Portugal.

Figure 5.5: The distribution of origins of travelers who booked the hotels in 2015-2017

Figure 5.5 maps bookings by country of origin to give a view of the booking distribution across the globe. The hover option shows the count of guests from each country. Hovering over the map shows that Portugal sees more tourists from Europe than from other continents.

Since the count of guests from each country may be of interest to many readers, the interactive table below provides a detailed view. The table shows that the bookings came from 177 different countries.

5.6 How does ADR differ between the two types of hotels for different types of guests?

We use descriptive statistics to see how the average daily rate differs across categories of guests in each hotel. Have foreign guests been helping to increase the hotels’ profit? Let us find out.

Figure 5.6: Distribution of ADR by guest origin in the city and resort hotel

We found that most bookings in both hotels are from international travelers, and the median ADR presented in Figure 5.6 shows that foreign guests in both hotels paid more than locals. We might therefore argue that these travelers have become the hotels’ valued customers. The hover option gives more detail on the summary statistics, including the 25th and 75th percentiles of ADR in euros. These are a useful reference for readers with a travel budget, who can compare the percentiles with their intended spending.
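The summary statistics mentioned here (25th percentile, median and 75th percentile of ADR per guest origin) can be sketched in Python as follows. Several details are assumptions: guests are treated as local when their country code is "PRT", the function name and sample values are hypothetical, and the "inclusive" quantile method may differ slightly from the one used by the plotting library behind Figure 5.6.

```python
from collections import defaultdict
from statistics import quantiles


def adr_quartiles_by_origin(bookings, home_country="PRT"):
    """Return (Q1, median, Q3) of ADR for local and international guests."""
    groups = defaultdict(list)
    for b in bookings:
        origin = "local" if b["country"] == home_country else "international"
        groups[origin].append(b["adr"])

    result = {}
    for origin, values in groups.items():
        # On older Python versions quantiles() needs at least two values per group.
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        result[origin] = (q1, median, q3)
    return result


# Hypothetical bookings, not taken from the dataset.
sample = [
    {"country": "GBR", "adr": 50}, {"country": "FRA", "adr": 80},
    {"country": "DEU", "adr": 100}, {"country": "ESP", "adr": 120},
    {"country": "PRT", "adr": 40}, {"country": "PRT", "adr": 60},
    {"country": "PRT", "adr": 70}, {"country": "PRT", "adr": 90},
]
stats = adr_quartiles_by_origin(sample)
print(stats["international"])  # (72.5, 90.0, 105.0)
print(stats["local"])          # (55.0, 65.0, 75.0)
```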

Since international travellers have become the hotels’ valued guests, it is easier to retain them as potential long-term customers through personalization. According to Criton (2019), personalization is the secret to winning the customer’s heart.

5.7 How do international guests like to stay in the hotels of Portugal, and how long do they stay?

To understand guest behavior, we look at how long guests stayed at each hotel. Let us find out which hotel was preferred for a longer stay.

Figure 5.7: The distribution of time the guest stayed at the hotels

There seems to be a lot to explore in Portugal: our analysis shows that most international guests stay for a longer period. We infer from Figure 5.7 that most tourists who travel to Portugal stay longer than a weekend. Next we look at which hotel tourists prefer, and whether the number of days stayed gives insight into that preference.

Figure 5.8: Length of stay of international guest in the city hotel

Figure 5.8 suggests that if the stay extends one or two days beyond the weekend, the percentage of bookings is greater in the city hotel. On closer inspection, however, if the traveller stays for a week or more, we might infer that they prefer the resort hotel.

5.8 Is there any relationship between the customer type and the ADR (Average Daily Rate) in the two different hotel types?

In this question we find out which customer types contribute the most at which hotel type. Before the analysis, we need to recall the customer types:

  • Transient: individuals or groups that occupy fewer than 10 rooms per night. These guests usually stay in the hotel short-term and require few services.
  • Contract: bookings bound by contracts, usually for more than 30 days for a consistent block of rooms.
  • Transient-Party: transient booking that is associated with another transient booking.
  • Group: bookings associated with a group, usually occupying more than 10 rooms per night.

Figure 5.9: Relationship between the customer type and the ADR (Average Daily Rate)

From Figure 5.9 we can infer that transient customers (individuals or groups that occupy fewer than 10 rooms per night) contribute more ADR at the City Hotel. Customers bound by contract also contribute slightly more at the City Hotel than at the Resort Hotel. Bookings associated with a group, usually occupying more than 10 rooms per night, mostly choose the Resort Hotel and contribute to its ADR. This finding aligns with the previous research question about the length of stay. We can therefore infer that when transient or contract customers visit Portugal, they mostly stay at the City Hotel and so contribute to its ADR.

5.9 Can the meal type recorded in the hotel data tell us anything about the guest’s length of stay?

Let us examine the relationship between the meal type a customer has chosen and that customer’s length of stay.

Figure 5.10: Relationship between Meals and Length of Stay

Figure 5.10 helps us examine the relationship between length of stay and meal type for customers in the two hotels. Looking first at the City Hotel, we infer that if the stay is only about a day, customers tend to choose breakfast only. If a customer books a trip of more than five days, they are most likely to choose a full board meal; otherwise customers choose a half board meal. In the Resort Hotel, by contrast, most customers choose a full board or half board meal, as customers there generally stay longer than three days.

5.10 Is there any relationship between the car parking spaces required and the two different types of hotels in the dataset?

We would like to find out whether car parking plays an important role for customers, and how this varies with different variables.

Figure 5.11: Different Guest Types and Car Parking

According to Figure 5.11, which plots the required number of car parking spaces against guest type, the majority of guests, whether international or local, do not need a car parking space, but a small percentage need one space (4200 bookings from international guests and 3077 from local guests).

Figure 5.12: Different Family Types and Car Parking

According to Figure 5.12, which plots the required number of car parking spaces against family type, the majority of guests, whether family or not family, do not need a car parking space, but a small percentage need one space (1850 bookings from families and 6288 from non-families).

Figure 5.13: Hotel Types and Parking

According to Figure 5.13, the majority of guests do not need a car parking space, but a small percentage need one space. By customer type, these bookings comprise 6402 transient, 133 contract, 797 transient-party and 51 group bookings.

Figure 5.14: Customer Types and Parking

According to Figure 5.14, the majority of guests do not need a car parking space, but a small percentage need one space. By hotel, 1921 of these bookings were at the city hotel and 5462 at the resort hotel. This indicates that guests of the resort hotel are more likely to need a car parking space, possibly because they are there on holiday. Interestingly, the resort hotel twice received a request for 8 car parking spaces. Neither booking was a family, and, surprisingly, both were transient-party rather than group bookings.

We mainly used functions from tidyverse (Wickham et al. 2019), ggplot2 (Wickham et al. 2020), and plotly (Sievert et al. 2020). We also used lubridate (Spinu, Grolemund, and Wickham 2020), kableExtra (Zhu 2019), gridExtra (Auguie 2017), DT (Xie, Cheng, and Tan 2020), maps (Brownrigg 2018), viridis (Garnier 2018a), and viridisLite (Garnier 2018b) for the visualizations.

6 Conclusion

In this study, we explored the dataset from many angles. We compared the two hotels featured in this study across the different variables provided in the dataset.

We found that the two hotels followed the overall seasonality trend in Portugal, with the high season falling in spring and summer. The two hotels were priced at different rates, with the City Hotel showing less fluctuation in ADR than the Resort Hotel. Also, by market segment, the ADR of the OTA and Direct booking channels appeared to be quite competitive, even though the OTA’s prices were still a bit better. We found that most customers book the city hotel through travel agencies, whereas the resort hotel receives relatively more direct or corporate bookings. We also examined which market segment gained the most semester by semester and saw that people have increasingly booked through online travel agencies. Finally, we looked at where around the globe Portugal’s tourists came from and found that most were European.

We further examined whether the guests at these hotels were international or local, and our analysis showed that the percentage of international guests was higher than that of local guests. In addition, we examined how the different customer types contributed to the ADR of the two hotels and found that only group bookings contributed most at the Resort Hotel, while the rest contributed more to the ADR of the City Hotel. The meal type booked by different customers gave us information about the length of stay in the two hotels, and we also analyzed the parking information.

Overall, we really enjoyed working on this project, and it makes us look forward to exploratory data analysis on other datasets.

References

Antonio, Nuno, Ana de Almeida, and Luis Nunes. 2019. “Hotel Booking Demand Datasets.” Data in Brief 22: 41–49.

Arel-Bundock, Vincent, Nils Enevoldsen, and CJ Yetman. 2018. “Countrycode: An R Package to Convert Country Names and Country Codes.” Journal of Open Source Software 3 (28): 848. https://doi.org/10.21105/joss.00848.

Auguie, Baptiste. 2017. GridExtra: Miscellaneous Functions for "Grid" Graphics. https://CRAN.R-project.org/package=gridExtra.

Bennett, Derrick A. 2001. “How Can I Deal with Missing Data in My Study?” Australian and New Zealand Journal of Public Health 25 (5): 464–69.

Brownrigg, Ray. 2018. Maps: Draw Geographical Maps. https://CRAN.R-project.org/package=maps.

Criton. 2019. “The Importance of Personalisation in the Hospitality Industry.” https://www.criton.com/news-hub/the-importance-of-personalisation-in-the-hospitality-industry/.

Garnier, Simon. 2018a. Viridis: Default Color Maps from ’Matplotlib’. https://CRAN.R-project.org/package=viridis.

———. 2018b. ViridisLite: Default Color Maps from ’Matplotlib’ (Lite Version). https://CRAN.R-project.org/package=viridisLite.

Howe, Neil. 2017. “Hotels Versus OTAs: Who Is Winning over Millennial Travelers?” https://www.forbes.com/sites/neilhowe/2017/07/31/hotels-versus-otas-who-is-winning-over-millennial-travelers/#27440fd5277a.

Jedina, Mohd Haniff, and Kohila Ranjinib. 2017. “Exploring the Key Factors of Hotel Online Booking Through Online Travel Agency.” In 4th International Conference on E-Commerce (Icoec) 2017 Held in Malaysia.

Kang, Hyun. 2013. “The Prevention and Handling of the Missing Data.” Korean Journal of Anesthesiology 64 (5): 402.

lisbonlisboaportugal.com. n.d. “When to Visit Lisbon? The Best Time of Year for a Holiday to Lisbon and Weather.” https://lisbonlisboaportugal.com/lisbon-tour/lisbon-weather-when-to-go-visit.html.

Meier, Veit. 2017. “Market Segmentation - Know Where Your Hotel Demand Comes from.” https://www.bernerbecker.com/latest-articles/market-segmentation-know-hotel-demand-comes/.

pegs.com. 2016. “Why Do Travellers Prefer Booking with OTAs?” https://www.pegs.com/blog/why-do-travelers-prefer-booking-with-otas/.

R Core Team. 2020. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Sievert, Carson, Chris Parmer, Toby Hocking, Scott Chamberlain, Karthik Ram, Marianne Corvellec, and Pedro Despouy. 2020. Plotly: Create Interactive Web Graphics via ’Plotly.js’. https://CRAN.R-project.org/package=plotly.

Spinu, Vitalie, Garrett Grolemund, and Hadley Wickham. 2020. Lubridate: Make Dealing with Dates a Little Easier. https://CRAN.R-project.org/package=lubridate.

Talwar, Shalini, Amandeep Dhir, Puneet Kaur, and Matti Mäntymäki. 2020. “Why Do People Purchase from Online Travel Agencies (OTAs)? A Consumption Values Perspective.” International Journal of Hospitality Management 88.

Tierney, Nicholas. 2017. “Visdat: Visualising Whole Data Frames.” Journal of Open Source Software 2 (16): 355. https://doi.org/10.21105/joss.00355.

Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.

Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, and Dewey Dunnington. 2020. Ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. https://CRAN.R-project.org/package=ggplot2.

Xie, Yihui, Joe Cheng, and Xianying Tan. 2020. DT: A Wrapper of the Javascript Library ’Datatables’. https://CRAN.R-project.org/package=DT.

Zhu, Hao. 2019. KableExtra: Construct Complex Table with ’Kable’ and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.